May 19, 2020

Packages for This Presentation

In order of occurence:

  • rgdal (reading in shapefiles)
  • fields (color scheme)
  • geosphere (calculating distance between points)
  • tidyverse (mostly for ggplot)
  • maps (for clean basemap)

What is "geospatial" data?

  • Any data with that contains geographic coordinates (GPS)
  • Can be tables, shapefiles, and even photographs
  • Example: looking at maps to see the most recent earthquake

Geospatial data can be tricky!

  • Locations are usually given by two distinct numbers


  • To designate direction: sometimes positive or negative numbers, sometimes positive numbers with N/S or E/W next to them!

Geospatial data can be tricky!


- To designate location: decimal degrees (DD), degrees minutes seconds (DMS), degrees minutes.minutes (DM.M), and sometimes even meters (what?!)

So… geospatial data can be tricky, and a bit sticky. But once you're used to it, it can be oh so satisfying!

Geospatial Data and Science

  • Tracking animals: Where do they go? How do they interact with their environment? (today)
  • Tracking flight patterns: Can an airline add more flights to their schedule?
  • Predicting the next earthquake from previous earthquake locations
  • and so much more!

Imagine…

You're a marine biologist, studying fish movement patterns.

Research questions

  • Are bottom-feeding fish spending time near a wastewater outfall?
  • How close are these fish to the outfall pipe?
Images courtesy of Dr. Larry Allen (CSU Northridge)

Implications:

  • Fish could be exposed to harmful contaminants
  • Humans that eat these fish could also be exposed













Your Study Site

A wastewater outfall ~5 mi offshore of Huntington Beach, CA

A Quick Intro into Acoustic Telemetry

  • Fish are tagged with transmitters
  • Transmitters send out a sound that's heard by underwater
    acoustic receivers
  • Receivers record the DateTime, receiver location, and
    tag ID # of the fish that sent the signal
  • If 3+ receivers are close together and hear the same fish
    we can get the location of the fish too

Start with the Basics

In your toolbox:

  • shapefile with wastewater outfall location
  • shapefile with water depths (bathymetry)
  • shapefile with US states (land masses)
  • tagged fish locations

Wait, what's a shapefile?

  • A set of geospatial files (.shp, .shx, .dbf, .prj) that can map themselves
  • The set must stay together, or the file breaks, but you only open the .shp

Start with the Basics

Step 1. Load required packages.

library(rgdal)

Step 2. Load shapefiles.

outfall <- readOGR("sample-data/shapefiles/outfall-reproject.shp",
                   verbose=FALSE)
depths <- readOGR("sample-data/shapefiles/all-depths.shp", verbose=FALSE)
states <- readOGR("sample-data/shapefiles/us-states.shp", verbose=FALSE)

Start with the Basics

Step 3. Check coordinate systems.

proj4string(outfall)
## [1] "+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0"
proj4string(depths)
## [1] "+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0"
proj4string(states)
## [1] "+proj=longlat +datum=NAD83 +no_defs +ellps=GRS80 +towgs84=0,0,0"

Oh no! outfall and depths are both "WGS84" (decimal degrees), but states is "NAD83" (meters)!

Start with the Basics

For all the layers to plot nicely on top of each other, they need to all be in the same coordinate system.

Step 4. Convert coordinate systems.

states <- spTransform(states, proj4string(outfall))

Check that the projection has properly converted

proj4string(states)
## [1] "+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0"

Mapping to Validate Data

Now, we can quickly plot the data and see how it looks so far.

plot(depths, col="blue", lty=2)
plot(outfall, col="black", lwd=2, add=TRUE)
plot(states, col="gray", add=TRUE)

Loading in Fish Locations

Step 5: Load in fish and receiver GPS points.

receiver_locations <- read.csv("sample-data/csv/receiverlocs.csv")
fish_locations <- read.csv("sample-data/csv/location_data.csv")
head(receiver_locations, n=2)
##         Y         X Station   VR2W
## 1 33.5927 -118.0067  OCSD_1 129778
## 2 33.5909 -118.0018  OCSD_2 129789
head(fish_locations, n=2)
##   X               times       LON      LAT
## 1 1 2017-06-16 16:16:00 -117.9986 33.58484
## 2 2 2017-06-16 16:17:00 -117.9982 33.58450

Re-Formatting

For this plot, I want to be kind of fancy.
I want the fish locations to be color-coded by time.

# First, make sure date/time are in POSIX format
fish_locations$times <- as.POSIXct(fish_locations$times, tz="UTC",
                                   format="%Y-%m-%d %H:%M:%S")
# Then, convert to numeric so we can add a color scheme
fish_locations$numeric_time <- as.numeric(fish_locations$times)

Time to Set Up a Pretty Plot

Load in color scheme package.

library(fields)

Set plot boundaries (trial and error)

xlimits<-c(-118.076750, -117.953354) # only interested in this range of longitudes
ylimits<-c(33.558575, 33.637742) # and this range of latitudes

The Final Plot Code

# Set up empty plot
plot(receiver_locations$X, receiver_locations$Y, xlim=xlimits, 
     ylim=ylimits, xlab="Longitude (DD)", ylab="Latitude (DD)", type="n") 

# Create nice bluegray "base"
rect(par("usr")[1], par("usr")[3], par("usr")[2], par("usr")[4], col ="slategray1") 

# Add depth contours
plot(depths, col="slategray3", lwd=2, lty=2, add=TRUE)

# Add US states
plot(states, col="gray", add=TRUE)

# Add receiver locations
points(receiver_locations$X, receiver_locations$Y, pch=20) 

# Add outfall pipe
plot(outfall, lwd=2, add=TRUE)

# Re-introduce pretty boundary around the plot
rect(par("usr")[1], par("usr")[3], par("usr")[2], par("usr")[4], col ="NA") 

# Add fish locations, color-coded by time
points(fish_locations$LON, fish_locations$LAT, pch=20, 
       col=color.scale(fish_locations$numeric_time,zlim=range(fish_locations$numeric_time),
                        col=rev(tim.colors(256))))

The Final Plot

Well Done!

Now we can start looking at some characteristics.
We can answer questions like:

  • Which depth contour is this fish spending most
    of its time?
  • How close is this fish to the outfall pipe?

"Which Depths?"

In depths shapefile, the "field_3" gives the depth in meters
Where -30 == 30 meters depth

library(geosphere)
# Grab random sample (for processing speed)
random_sample <- fish_locations[sample(nrow(fish_locations), 1000), ]

# Figure out the shapefile attribute that is closest to the fish gps point
distance_depths <- dist2Line(random_sample[,c("LON", "LAT")], depths)
distance_depths <- as.data.frame(distance_depths)

# Add depth column to random_sample that gives us the depth value
random_sample$depth <- depths$field_3[distance_depths$ID]

"Which Depths?"

Plot it!

distance_table <- as.data.frame(table(random_sample$depth))
barplot(height=distance_table$Freq, names.arg=rev(distance_table$Var1),
        col="darkorange4", main="", xlab="depth (m)", ylab="frequency")

"How Close to the Outfall?"

Calculate the distance between fish locations and the nearest part of the outfall pipe.

# Dist2Line to get distances
outfall_distance <- dist2Line(random_sample[,c("LON", "LAT")], outfall)

# Convert to data frame
outfall_distance <- as.data.frame(outfall_distance)

# Add distance (m) to the random sample data frame
random_sample$outfall_distance <- outfall_distance$distance

"How Close to the Outfall?"

If this fish was using the outfall area a lot, we would see a distribution of distances that looks like this.

# Just generate some data
expected_hist <- rep(seq(0,2750,250), c(600, 500, 50, 30, 10, 5, 2, 1, 0, 0, 0, 0))

# Plot it
hist(expected_hist, col="gray", xlab="distance to outfall pipe (m)", main="", breaks=12, xlim=c(0,2750))

"How Close to the Outfall?"

How do our data compare?

# Round fish locations into 250 m groups
random_sample$outfall_distance <- floor(random_sample$outfall_distance/250)*250

# Generate expected and observed plots, but don't plot them yet
expected <- hist(expected_hist, breaks=12, plot=FALSE)
obs <- hist(random_sample$outfall_distance, breaks=12, plot=FALSE)

"How Close to the Outfall?"

So, from this, it looks like this individual isn't regularly using the outfall pipe.

Phew! Is there an easier way to mapping?

There always is an easier way.

ggplot actually has some nice features for mapping.

library(tidyverse)
library(maps)
# Get map of the world
world <- map_data("world")

# Create a ggplot map using this layer
worldplot <- ggplot() +
  geom_polygon(data = world, 
               aes(x=long, y = lat, group = group), fill="gray") + 
  coord_fixed(1, xlim = c(-119, -117), ylim = c(33,34)) +
  xlab("Longitude (DD)") + ylab("Latitude (DD)")

# Plot it
# worldplot

Phew! Is there an easier way to mapping?

Phew! Is there an easier way to mapping?

Let's add some fish data and re-plot it.

# Add our fish data
layers <- worldplot + 
  geom_point(fish_locations, mapping = aes(x=LON, y=LAT), 
             color="darkorange", alpha=0.5)

# Create theme so background is blue like the ocean
new_theme <- theme(panel.background = 
                     element_rect(fill = "slategray1", colour = NA), 
        panel.grid.major = element_line(colour = NA),
        panel.grid.minor = element_line(colour = NA))

# Plot it
# layers + new_theme

Phew! Is there an easier way to mapping?

Phew! Is there an easier way to mapping?

  • ggplot is a lot like matplotlib in python
  • Lots of little steps to create a nice plot
  • But it gets pretty tricky when you're trying to add your own shapefiles or trying to have a very fine-scale spatial resolution
Leaflet for R is also so cool!

Quick Recap

Mapping in R can be difficult, but geospatial data can answer some very interesting questions.

Once you get the hang of it, mapping in R can be very rewarding!

Thank you!














Twitter: echelle_burns